Skip to content

Conversation

@mtshiba
Copy link
Contributor

@mtshiba mtshiba commented Nov 29, 2025

Summary

from astral-sh/ty#1670

If a data structure that depends on salsa IDs is used as the key for an FxHashMap or the value of an FxHashSet, the output order of the iterator will be unstable. If a query depends on this order, the results of fixed-point iteration will be unstable.
In this case, Fx{Index, Order}{Map, Set} should be used, but it seems that this is not being done thoroughly.

Simply replacing all FxHash{Map, Set} with Fx{Index, Order}{Map, Set} would solve the problem, but it is generally believed that the former has slightly better performance if we are only using insertion and retrieval without using a set/map as an iterator.

Therefore, this PR proposes a compromise.
That is, replace all FxHash{Map, Set} used within ty with wrapper structs that does not implement (Into)Iterator, and instead define methods like unstable_iter to make users of these structs aware of whether iteration operations are safe.

After performing this refactoring, I discovered some suspicious parts. I hope this fix will help to resolve the issue.

Test Plan

@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 29, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 29, 2025

mypy_primer results

Changes were detected when running on open source projects
scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/_logging.py:153:13: warning[unsupported-base] Unsupported class base with type `<class 'Mapping[str, Style]'> | <class 'Mapping[str, Divergent]'>`
- Found 41 diagnostics
+ Found 42 diagnostics

pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
+ pandas-stubs/_typing.pyi:1209:16: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 5511 diagnostics
+ Found 5512 diagnostics

No memory usage changes detected ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 29, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.


let mut typevars = FxHashSet::default();
// We should use `FxIndexSet` here since `BoundTypeVarInstance::{valid, required}_specializations` is query-dependent.
let mut typevars = FxIndexSet::default();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably use FxIndexSet here.
The rest of this module seems okay to use unstable iterators, unless I'm overlooking something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say more what query-dependent means?

Looking at the loop below, it doesn't seem to depend on ordering as it returns true only if all typevars satisfy the constraints and there's no state between the type var checking, as far as I can tell

Copy link
Contributor Author

@mtshiba mtshiba Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{valid, required}_specialization uses lazy_{bound, constraints} internally. I meant that the order in which these queries are called is non-deterministic.

@mtshiba mtshiba closed this Nov 29, 2025
@mtshiba mtshiba reopened this Nov 29, 2025
@mtshiba mtshiba closed this Nov 29, 2025
@mtshiba mtshiba reopened this Nov 29, 2025
@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Nov 29, 2025
@mtshiba
Copy link
Contributor Author

mtshiba commented Nov 29, 2025

Unfortunately, it appears that non-determinism remains due to factors entirely different from those considered in this PR.
However, that doesn't mean I think these changes are useless (they appear to improve performance slightly in benchmarks of large projects).

Anyway, I'm marking this PR as ready for review.

@mtshiba mtshiba marked this pull request as ready for review November 29, 2025 11:38
@MichaReiser
Copy link
Member

MichaReiser commented Nov 29, 2025

I like the approach, but I don't think using an IndexMap is the solution.

Iterating over two HashMaps, each created by inserting the same elements in the same order, will yield the same iteration order when run on the same platform.

The issue we see with fix point is that it's no longer guaranteed that the elements are inserted in the same order. However, we'll have the exact same issue when using IndexMap because IndexMap's iteration order is defined by insertion order, which isn't guaranteed to be deterministic within a cyclic query.

I'm not sure what the solution here is, other than applying some sort of sorting (somewhere?).

@MichaReiser
Copy link
Member

There's also one flaky diagnostic. What I suspect is that our convergence functions are now sensitive to which query is the outer-most cycle or some query that bails early as soon as it sees the first Divergent type (any type), and the any type is only visible depending on the cycle nesting

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed all the changes but I don't think the StableKey assumption is safe.

I do think it makes sense to have a hash map wrapper and this is also what rustc does https://github.com/rust-lang/rust/blob/ae90dcf0207c57c3034f00b07048d63f8b2363c8/compiler/rustc_data_structures/src/stable_map.rs#L45

Another solution (maybe less invasive?) could be to initialize the hashser function with a random state (instead of 0) in debug builds, so that iteration order is guaranteed to be different across runs (or have a feature flag that we can turn on)

@Gankra
Copy link
Contributor

Gankra commented Dec 1, 2025

re: randomizing layouts: if you want to go down that route I suggest cribbing from rustc's -Zrandomize-layout which lets you pass the seed in as a CLI argument (and print the seed any time one is selected so people can try to reproduce an issue).

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm leaning towards changing the hash maps in separate PRs and more explicitly talk about why the changes are necessary. I'm not convinced that it's necessary to use FxIndexMap in many cases.


let mut typevars = FxHashSet::default();
// We should use `FxIndexSet` here since `BoundTypeVarInstance::{valid, required}_specializations` is query-dependent.
let mut typevars = FxIndexSet::default();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say more what query-dependent means?

Looking at the loop below, it doesn't seem to depend on ordering as it returns true only if all typevars satisfy the constraints and there's no state between the type var checking, as far as I can tell

/// List all members of a given type: anything that would be valid when accessed
/// as an attribute on an object of the given type.
pub fn all_members<'db>(db: &'db dyn Db, ty: Type<'db>) -> FxHashSet<Member<'db>> {
pub fn all_members<'db>(db: &'db dyn Db, ty: Type<'db>) -> FxIndexSet<Member<'db>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this anywhere outside the LSP?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be used in ty_python_semantic as well (https://github.com/mtshiba/ruff/blob/stable-iteration/crates/ty_python_semantic/src/types/function.rs#L1464-L1467). In that case, however, an unstable iterator is not a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is just for internal use so that we can test the function itself in our mdtests; it's not a public-facing part of ty_python_semantic

@sharkdp sharkdp removed their request for review December 2, 2025 08:20
@carljm carljm removed their request for review December 3, 2025 06:46
@mtshiba mtshiba changed the title [ty] disallow using unstable iterators of FxHash{Map, Set} as query input [ty] replace FxHash{Map, Set} with footgun-mitigated APIs Dec 3, 2025
@mtshiba
Copy link
Contributor Author

mtshiba commented Dec 3, 2025

I think I'm leaning towards changing the hash maps in separate PRs and more explicitly talk about why the changes are necessary. I'm not convinced that it's necessary to use FxIndexMap in many cases.

I changed this PR to just replace FxHash{Map, Set} with new APIs. Therefore, this PR does not change any of the behavior of ty. Changes that affect behavior will be made after this PR (with reconsideration).

@mtshiba mtshiba requested a review from MichaReiser December 3, 2025 18:08
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating.

I'd be curious to get some more opinions on this. To me, this PR goes from one extreme to the other. I don't think we need to enforce this new API in all crates. E.g. I think it's completely fine to use the normal FxHashMap and IntoIterator in the project file discovery.

I'm leaning towards:

  • Giving those types a different name instead of overriding FxHashSet. I'm not entirely sure what to name them
  • Update our Contribution guidelines to state that these types should be used in ty_python_semantic with an explanation why it's important
  • Add the types to ruff_db

But maybe it's just me who dislikes the extra verbosity and others actually prefer to override FxHashSet and FxHashMap

@mtshiba
Copy link
Contributor Author

mtshiba commented Dec 4, 2025

The purpose of this PR is to make developers aware that using FxHash{Map, Set} as an iterator is inherently unstable, and in a sense, it intentionally introduces noisy APIs.
However, if you feel it's troublesome to continue using unstable_iter() in places where it is clearly OK to use it as an iterator, one option is to give FxHash{Map, Set} a tag type and change its behavior depending on the tag. In this case, there is no need to prepare two types of structs.

use std::collections::HashMap;
use std::collections::hash_map::IntoIter;

/// You are sure you can always use this hash map/set as an iterator.
#[derive(Default)]
pub struct ImplIntoIterator;
#[derive(Default)]
pub struct NoIntoIterator;

#[derive(Default)]
pub struct FxHashMap<K, V, Tag=NoIntoIterator>(HashMap<K, V>, std::marker::PhantomData<Tag>);

impl<K, V> IntoIterator for FxHashMap<K, V, ImplIntoIterator> {
    type Item = (K, V);
    type IntoIter = IntoIter<K, V>;

    fn into_iter(self) -> IntoIter<K, V> {
        self.0.into_iter()
    }
}

fn main() {
    let map1: FxHashMap<u32, u32> = FxHashMap::default();
    let map2: FxHashMap<u32, u32, ImplIntoIterator> = FxHashMap::default();
    
    for _i in map1 {} // ERR
    for _i in map2 {} // OK
}

Using FxHashMap<K, V, ImplIntoIterator> means that you are declaring that it can always be used as an iterator and there is no problem with that, and using FxHashMap<K, V, NoIntoIterator> means that you need to make sure it is OK to use before each iteration (if it is safe you can use unstable_iter()).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants